Abstract

In Corynebacterium glutamicum the LysE carrier protein exhibits the
unique function of exporting L-lysine. We here analyze the membrane
topology of LysE, a protein of 236 amino acyl residues, using PhoA- and
LacZ-fusions. The amino-terminal end of LysE is located in the cytoplasm
whereas the carboxy-terminal end is found in the periplasm. Although 6
hydrophobic domains were identified based on hydropathy analyses, only
five transmembrane spanning helices appear to be present. The additional
hydrophobic segment may dip into the membrane or be surface localized.
We show that LysE is a member of a family of proteins found, for
example, in Escherichia coli, Bacillus subtilis, Mycobacterium
tuberculosis and Helicobacter pylori. This family, which we have
designated the LysE family, is distantly related to two additional
protein families which we have designated the YahN and CadD families.
These three families, the members of which exhibit similar sizes,
hydropathy profiles, and sequence motifs, comprise the LysE superfamily.
Functionally characterized members of the LysE superfamily export
L-lysine, cadmium and possibly quaternary amines. We suggest that LysE
superfamily members will prove to catalyze export of a variety of
biologically important solutes.

Introduction

An unusual bacterial transport protein was recently characterized
biochemically, physiologically, and molecularly (Broer and Kramer, 1991;
Vrljic et al., 1995; 1996). This carrier is LysE from the Gram-positive
bacterium Corynebacterium glutamicum. It exports L-lysine, thereby
regulating the intracellular concentration of this amino acid. The need
for this function is surprising as the L-lysine biosynthetic pathway in
C. glutamicum, as in other bacteria, is strictly regulated (Nakayama,
1985; Eggeling, 1994). However, L-lysine export is essential in
environments containing peptides; in the presence of low concentrations
of lysine-containing peptides and upon deletion of the export carrier
gene lysE, exceptionally high cytoplasmic concentrations (>1 M) of
L-lysine build up, leading to bacteriostasis. The specific export
function provided by LysE is also a prerequisite for the production of
L-lysine with C. glutamicum, mutants of which are used for the
production of about 3.5 x 10^5 tons of this amino acid per year
(Leuchtenberger, 1996). LysE therefore provides a new function in
regulating a cytoplasmic amino acid concentration, and it is the first
example of export as a target for the improvement of industrial amino
acid production.

In addition to the unique physiological function of LysE, its structure
is unusual. The carrier consists of 236 amino acyl residues (Vrljic et
al., 1996), and it exhibits six hydrophobic domains that could
correspond to six transmembrane helical spanners typical of many
polytopic membrane transport proteins. LysE lacks significant sequence
similarity to known export translocators which are constituent members
of the identified 12 families catalyzing efflux of organic molecules and
cations (Saier, 1994; Saier et al., 1994; Paulsen et al., 1996). It
therefore represents a novel family of proteins distinct from all other
established families of transporters (Vrljic et al., 1996). The current
veritable flood of genome sequencing activities has resulted in the
realization that numerous putative LysE homologues are encoded on
bacterial chromosomes and plasmids. LysE evidently represents the tip of
an iceberg of a novel class of transporters which may exhibit unique
physiological functions analogous to that of LysE.

In the present paper we report investigations into the membrane topology
of LysE using the well established methodology of alkaline phosphatase
and β-galactosidase fusion constructions. Additionally, we searched the
DNA and protein databases for the presence of LysE homologues. A
thorough analysis revealed that LysE is a member of a novel superfamily
of carriers that consists of three discrete families. The functionally
characterized members of this superfamily exclusively catalyze solute
export.

Results 

The LysE Family

Currently sequenced members of the LysE family are 
presented in Table 1. The LysE protein of C. glutamicum is
the only functionally characterized member of this family. 
Table 1. Protein Members of the LysE Family

Abbreviation  Name or description in database                             Organism                     Size (no. residues)  Database and accession no.
LysE Cgl      Lysine exporter protein                                     Corynebacterium glutamicum   236                  gbX96471
Orf1 Mtu*     Hypothetical 20.9 KD protein CY20G9.14                      Mycobacterium tuberculosis   212                  spQ11154
Orf2 Mtu      Hypothetical 20.8 KD protein CY39.33C                       Mycobacterium tuberculosis   199                  spQ10871
YggA Eco      Hypothetical 23.2 KD protein in SBM-FBA intergenic region   Escherichia coli             211                  spP11667
YggA Asa*     Lysine export homologue                                     Aeromonas salmonicida        206                  gbU65741
YggA Ahy*     YggA homologue gene product                                 Aeromonas hydrophila         206                  gbX89469
Orf Bsu       Hypothetical protein YtsU                                   Bacillus subtilis            220                  gbY09476
Orf Hpy       Conserved hypothetical integral membrane protein            Helicobacter pylori          210                  gbAE000585

* Based on the DNA sequences, the amino acid sequences of these three proteins (used in the reported analyses) were modified from those reported in the
database entries as follows:

Orf1 Mtu: Eleven residues were added to the amino terminus of the sequence reported in the database. The initiation codon codes for valine.
YggA Asa: Fourteen residues were added to the amino terminus of the sequence reported in the database. The initiation codon codes for methionine.
YggA Ahy: A probable frameshift sequencing error in the 3' region of the gene was identified approximately at position 2241. The correct reading frame was
deduced by comparison with the aligned homologues.



Figure 1. Complete multiple alignment of the sequences of the proteins that comprise the LysE family. The TREE program of Feng and Doolittle (1990) was
used to derive the multiple alignments shown here and in subsequent figures. Asterisks above the alignment indicate fully conserved residues while residues
conserved in a majority of the proteins are presented in the consensus sequence below the alignment. Residue numbers in each protein are provided in
parentheses following the protein abbreviation at the beginning of each line. Abbreviations of the proteins are as indicated in Table 1.
The eight proteins found are of about the same size (199-236 residues)
with the C. glutamicum protein being the largest. Probable sequencing
errors and incorrect choices of initiation codons were detected in the
genes encoding proteins from the two Aeromonas species as well as
Mycobacterium tuberculosis (Orf1), and these led to incorrect protein
sequence entries in the databases. These sequences were altered in
accordance with the sequence of the characterized LysE protein and
predictions based on the complete multiple alignment. The sequences used
are shown in the alignment presented in Figure 1.

This alignment served to generate the average hydrophobicity plot shown
in Figure 2A. Six peaks of hydrophobicity were observed with the
potential to form membrane spanning α-helices. All of these regions
occur without gaps in the multiple alignment with the exception of the
divergent sequence of B. subtilis which exhibits a four-residue gap near
the end of hydrophobic region #3. As transmembrane hydrophobic regions
are usually better conserved than loop regions of integral membrane
proteins, the results are consistent with the suggestion that most of
these regions are transmembrane.
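The averaging behind such a plot can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure actually used for Figure 2: the hydropathy values are the published Kyte and Doolittle (1982) scale, but the function names, the 19-residue default window, the treatment of gap columns, and the toy alignment are all hypothetical choices made here.

```python
# Kyte-Doolittle hydropathy index (Kyte and Doolittle, 1982).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def column_means(alignment):
    """Mean hydropathy of each alignment column; gaps ('-') are skipped."""
    means = []
    for column in zip(*alignment):
        values = [KD[aa] for aa in column if aa in KD]
        # An all-gap column contributes a neutral 0.0 (an assumption).
        means.append(sum(values) / len(values) if values else 0.0)
    return means


def smooth(values, window):
    """Sliding-window average, one value per complete window."""
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


def average_hydropathy(alignment, window=19):
    """Windowed average hydropathy along an alignment (window size assumed)."""
    return smooth(column_means(alignment), window)


if __name__ == "__main__":
    # Hypothetical aligned fragments, for illustration only.
    toy = ["LLIVAG-KRDE", "LIIVAGAKR-E"]
    for start, value in enumerate(average_hydropathy(toy, window=5), start=1):
        print(start, round(value, 2))
```

Peaks in the returned profile that stay positive over roughly 20 alignment positions would be the candidates for membrane spanning segments; the threshold for calling a peak is left open here.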

The largest hydrophilic loop of LysE is present between predicted
hydrophobic regions #3 and #4. This loop is unique to LysE of C.
glutamicum, since it is larger than that of its homologues and since it
exhibits a high percentage of V and R residues (45%) as follows:
AVATDTRNRVRVEVSVDKQRV. The Vs and Rs are underlined, and alternating
hydrophilic residues are presented in bold face. Similar highly charged
repeat sequences are found in loop regions of other transporters
(Paulsen and Saier, 1997; Eng et al., 1998).

Members of the LysE family show striking similarity throughout their
lengths (Figure 1). As indicated by the asterisks, 16 residues are fully
conserved. All but one of the six fully conserved glycyl residues are
located in the first half of the alignment. The most conserved regions
are present in hydrophobic regions #1, #2 and #4. These regions served
for the derivation of signature sequences for the LysE family proteins,
as follows:

1) (SAP) (LIV) G (PAV) Q (NS) (LIVA) (FL) (LIV) (LIVMF) X (QR) G

2) (CA) X (LIVM) (SAC) D (LIVTG) (LIVFA) L (LIVMF) X2 (GA) X10

3) (CATM) (LIVA) (GAV) (LIVF) (TS) (WFL) L N P (HNQ) (AV) (YI) L (DE)

(X = any residue; alternative residues at any position are
indicated in parentheses).
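Signature sequences written in this notation translate directly into regular expressions, which gives a quick way to scan a new sequence for LysE family membership. The sketch below is an assumption-laden illustration rather than the authors' method: the helper names are invented here, the residue groups are read as LIV-type sets, a number after X is taken to mean that many arbitrary residues, and the test fragments are short illustrative strings chosen to satisfy the patterns, not database entries.

```python
import re

# The three family signatures as read from the text (reading assumed).
SIGNATURES = {
    1: "(SAP)(LIV)G(PAV)Q(NS)(LIVA)(FL)(LIV)(LIVMF)X(QR)G",
    2: "(CA)X(LIVM)(SAC)D(LIVTG)(LIVFA)L(LIVMF)X2(GA)X10",
    3: "(CATM)(LIVA)(GAV)(LIVF)(TS)(WFL)LNP(HNQ)(AV)(YI)L(DE)",
}

# One token is either "(ABC)" or a single letter with an optional count.
_TOKEN = re.compile(r"\(([A-Z]+)\)|([A-Z])(\d*)")


def signature_to_regex(signature):
    """Translate a signature such as '(GA) X2 L' into a regex string."""
    parts = []
    for alternatives, letter, count in _TOKEN.findall(signature.replace(" ", "")):
        if alternatives:
            parts.append("[" + alternatives + "]")
        else:
            unit = "." if letter == "X" else letter  # X = any residue
            parts.append(unit + ("{" + count + "}" if count else ""))
    return "".join(parts)


def find_signature(sequence, signature):
    """Return (1-based start, matched text) for each non-overlapping hit."""
    pattern = re.compile(signature_to_regex(signature))
    return [(m.start() + 1, m.group()) for m in pattern.finditer(sequence)]


if __name__ == "__main__":
    # Illustrative fragment only; flanking residues are hypothetical.
    print(find_signature("AAASIGPQNVLVIKQGAAA", SIGNATURES[1]))
```

A sequence would be a plausible family member when all three signatures match in the expected order; how many mismatches to tolerate is a design choice this sketch does not address.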

The phylogenetic tree for the LysE family proteins is shown in Figure
3A. As expected for closely related orthologues, the two Aeromonas
proteins cluster tightly together. The two M. tuberculosis paralogues
are also similar in sequence, suggesting that these two proteins arose
by a late gene duplication event that occurred after Gram-positive
bacteria diverged from Gram-negative bacteria. It is noteworthy that
phylogenetic distances separating the proteins do not correlate with the
phylogenies of the organisms. This fact suggests that the LysE family
includes paralogues of dissimilar substrate specificities and functions.

Figure 2. Average hydrophobicity plots for (A) the LysE family, (B) the YahN family, and (C) the CadD family, plotted against alignment position. The average
hydrophobicity plots were based on hydropathy values of individual amino acids as described by Kyte and Doolittle (1982).

Figure 3. Phylogenetic trees for (A) the LysE family, (B) the YahN family,
and (C) the CadD family. The trees were constructed using the TREE
program (Feng and Doolittle, 1990; Saier, 1994). Branch lengths are
approximately proportional to phylogenetic distance. Abbreviations for the
proteins are as indicated in Tables 1-3.

Analysis of LysE-Fusion Proteins 

The predicted positions of the 6 hydrophobic regions within LysE of C.
glutamicum, based on the hydropathy analysis of the LysE family (Figure
2A), are shown in Figure 4 and numbered #1 to #6. Their putative start
and termination points are also specified. Since lysE can be
functionally expressed in E. coli (not shown), the well established
alkaline phosphatase gene fusion technique was used to assay for the
disposition of the polypeptide chain in the membrane (Manoil and
Beckwith, 1986). Twenty two fusions were made to carboxy terminally
truncated LysE molecules. As expected, assay of the respective
recombinant E. coli CC118 strains carrying the lysE'phoA gene fusions
gave either a blue or a white phenotype on LB plates containing the
chromogenic alkaline phosphatase substrate 5-bromo-4-chloro-indolyl
phosphate. The data for the quantitative assay of alkaline phosphatase
specific activity are given in Figure 4. Fusions at positions 99, 111,
112, 113, 114, 115, 120, 130, 143, 154, 164, 183, and 211 (numbers
correspond to the last amino acid of LysE in the LysE'PhoA polypeptide)
resulted in nearly no alkaline phosphatase activity, comparable to the
control containing the leaderless PhoA polypeptide. High activities
resulted with all fusions made in the amino terminal parts of the
protein as well as to locations 171 and 236 of LysE. In addition to the
phoA fusions, lacZ fusions were also made. However, sequencing revealed
that the correct constructs were only present in 2 of the 5 fusions
produced. These 2 fusions at positions 111 and 211 of LysE resulted in a
blue phenotype in E. coli CC118 with
5-bromo-4-chloro-3-indolyl-β-D-galactoside as substrate. They thus
exhibit β-galactosidase activity at fusion sites where the corresponding
PhoA fusions did not cause any activity.
In order to establish that the LysE'PhoA fusion proteins had been
synthesized and were present, Western blot analyses were conducted
(Figure 5). The observed sizes of fusion proteins 111, 112, 114, and 115
are similar to the predicted sizes (61.9 kDa to 62.5 kDa). However, with
the other fusions, visible quantities of the full-length fusion proteins
were not detected, and the intensities of the PhoA-antibody positive
proteins did not correlate with the PhoA specific activities. For
instance, although fusions 59, 67, and 75 resulted in high specific PhoA
activities (Figure 4), they did not result in visible proteins of the
expected sizes of 55.5, 56.5, and 57.6 kDa, respectively. This suggests a
low steady state level of the membrane-inserted fusion protein, possibly
due to its susceptibility to rapid degradation. The instability of
topology probe fusions has been documented previously (Sarsero and
Pittard, 1995). Because degradation products were present in the
extracts analysed, and because the size of these products increased with
increasing fusion length, we suggest that these fusion proteins were
synthesized and that the low activities observed were not due to poor
expression.

LysE Topological Model 

The high alkaline phosphatase activity observed with PhoA fused to the
last amino acyl residue 236 suggested that the carboxyl end of the LysE
polypeptide is localized to the periplasm. This fact determines the
orientation of hydrophobic region #6 (Figure 6). In accordance with this
proposed orientation is the fact that alkaline phosphatase is not
translocated into the periplasm when fused at position 211, whereas
β-galactosidase fused at this position is active. The predicted loop
region between hydrophobic regions #5 and #6 (loop 5:6) is therefore
directed towards the cytoplasm. The PhoA fusion at position 171 is blue,
localizing loop 4:5 to the periplasmic side. The white PhoA fusions at
positions 99-130 in loop 3:4 and the blue LacZ fusion at position 111
place this unusually large loop of the LysE polypeptide in the
cytoplasm. We therefore assign the amino terminal end of hydrophobic
region #3 to the periplasmic side of the membrane. In accordance with
this orientation is the translocation of PhoA when fused at positions 67
and 75. This assignment in the carboxy terminal part of LysE fits with a
predicted carrier model exhibiting alternating loops and transmembrane
spanning helices where each of the four hydrophobic regions #3 to #6
spans the membrane once (Figure 6).
Figure 4. Hydrophobic regions of the C. glutamicum LysE protein and alkaline phosphatase activities of cells expressing LysE'PhoA fusion proteins. At the
top of the figure, the predicted helices of LysE are given together with their start and end points. The bars in the lower part of the figure give the specific
alkaline phosphatase activities in pmol/min/mg (protein). They are located at the fusion sites of LysE with the sequence numbering given on the x-axis.

Figure 5. Western blot analysis of cells expressing lysE'phoA fusions. Cell
extracts were separated on a 10% SDS polyacrylamide gel, and proteins
were probed using antibody directed against PhoA. The individual fusion
products are given with numbers indicating the last LysE-specific amino
acid in the fusion protein. C+, extracts of E. coli CC118 pPA4 encoding
pre-PhoA. C-, extract from the plasmidless strain. Pre-PhoA protein is
marked by an open arrow head, and mature PhoA by a closed arrow head. S
represents the protein standards with molecular weights in kilodaltons,
given on the right.

Figure 6. Models for the two-dimensional topology of LysE of C. glutamicum.
Models were predicted from (A) the average hydropathy plot, and (B) the
PhoA/LacZ fusion analyses. The flags are located according to the assigned
helices as given in Figure 4 and according to the fusion point. The dark and
white flags indicate PhoA fusions resulting in high and low alkaline
phosphatase activity, respectively. The two LacZ fusions obtained exhibiting
high β-galactosidase activity are given as hatched flags.
Table 2. Protein Members of the YahN Family

Abbreviation  Name or description in database                                       Organism                 Size (no. residues)  Database and accession no.
YahN Eco      Hypothetical 24.8 KD protein in betT-prpR intergenic region           Escherichia coli         223                  spP75693
YigJ Eco      Hypothetical 22.5 KD protein in recQ-pldB intergenic region           Escherichia coli         206                  spP27846
Yd07 Hin      Hypothetical protein HI1307                                           Haemophilus influenzae   210                  spQ57320
YcaR Pae      Hypothetical 23.3 KD protein in carA-carB intergenic region           Pseudomonas aeruginosa   216                  spP38102
YeaS Eco      Hypothetical 23.2 KD protein in gapA-rnd intergenic region            Escherichia coli         212                  spP76249
YrhP Bsu      Hypothetical protein in aapA-sigV intergenic region                   Bacillus subtilis        210                  gbZ99117
Yg27 Syn      Hypothetical 22.0 KD protein Slr1627                                  Synechocystis PCC6803    206                  spP74343
MigA Sco*     MigA protein                                                          Shewanella colwelliana   206                  pirS23222
YigK Eco*     Hypothetical 15.4 KD protein in recQ-pldB intergenic region (F138)    Escherichia coli         204                  spP27847
YfiK Eco      Hypothetical 21.2 KD protein in srmB-ung intergenic region            Escherichia coli         195                  spP38101

* The sequences of these two proteins have been modified from those reported in the databases, based on the DNA sequences and the complete multiple
alignment for the YahN family, as follows. MigA: 24 residues were added to the amino terminus, and 32 residues were added to the C-terminus. These
changes reflect (1) an incorrect selection of initiation codon, and (2) a sequencing error in the 575 nucleotide sequence region as reported under acc#
gbX67020, respectively. YigK: 68 residues were added to the amino terminus. These changes reflect (1) an incorrect selection of initiation codon, and (2) a
sequencing error in the 4092 nucleotide sequence region as reported under acc# gbU00096 (AE000458), respectively.



Figure 7. Multiple alignment of ten members of the YahN family.
Protein abbreviations are as provided in Table 2. The alignment was
generated using the TREE program. The conventions of presentation are
as for Figure 1.
Table 3. Protein Members of the CadD Family

Abbreviation  Name or description                     Organism                     Size (no. residues)  Database and accession no.
CadB Slu      Cadmium resistance protein              Staphylococcus lugdunensis   209                  gbU74623
CadD Sau      Cadmium resistance protein              Staphylococcus aureus        209                  gbU7a550
QacF Bfi      Putative quaternary amine transporter   Bacillus firmus              197                  gbZ17326
Orf Sau       Uncharacterized                         Staphylococcus aureus        219                  gbL10909



The situation with respect to the amino terminal part of LysE,
representing about one quarter of the polypeptide chain and
corresponding to hydrophobic regions #1 and #2, is more complex. In
accordance with the experimental analysis resulting in assignment of the
orientations of hydrophobic segments #3 to #6, putative loop 1:2 should
be oriented towards the inside (Figure 6A). However, fusions of PhoA at
positions 29 and 33 resulted in translocation of alkaline phosphatase to
the periplasm. Thus, the amino terminal polypeptide of 29 amino acyl
residues can replace the natural hydrophobic leader of alkaline
phosphatase. Since a lysyl residue is located at position 30 in LysE,
which is immediately followed by another positively charged residue
(arginyl), two additional fusions were made at positions 37 and 40. This
was done because positively charged residues can assist in the final
orientation of helices in the membrane as demonstrated for the leader
peptidase, Lep, of E. coli (v. Heijne, 1994). However, translocation of
PhoA was still observed for the LysE-PhoA fusions which included these
positively charged residues. Therefore, hydrophobic region #1 serves as
a topological determinant for PhoA translocation, and the putative loop
between hydrophobic regions #1 and #2 is directed towards the periplasm.
A plausible model based on the fusion analyses is presented in Figure 6B.
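The reasoning from reporter activities to a topological model can be made explicit with a short Python sketch. It uses only the fusion positions quoted in the text; everything else is an assumption made for illustration. In particular, treating every low-activity PhoA fusion as cytoplasmic is a simplification (a fusion point inside a membrane segment can also give low activity, and LacZ confirmation exists only for positions 111 and 211), and the function names are hypothetical.

```python
from itertools import groupby

# Fusion positions as reported in the text (last LysE residue in the fusion).
PHOA_HIGH = {29, 33, 37, 40, 59, 67, 75, 171, 236}
PHOA_LOW = {99, 111, 112, 113, 114, 115, 120, 130, 143, 154, 164, 183, 211}
LACZ_ACTIVE = {111, 211}


def side(position):
    """Active PhoA implies export; low PhoA is read as cytoplasmic (assumed)."""
    return "periplasm" if position in PHOA_HIGH else "cytoplasm"


def runs():
    """Group consecutive fusion sites that report the same side."""
    result = []
    for label, group in groupby(sorted(PHOA_HIGH | PHOA_LOW), key=side):
        members = list(group)
        result.append((members[0], members[-1], label))
    return result


def membrane_crossings():
    """Each change of side between neighbouring runs needs one crossing."""
    return len(runs()) - 1


if __name__ == "__main__":
    for first, last, label in runs():
        confirmed = sorted(p for p in LACZ_ACTIVE if first <= p <= last)
        print(first, last, label, "LacZ-confirmed at", confirmed)
    print("crossings after position 29:", membrane_crossings())
```

Under these assumptions the sites from 29 to 75 all report the same side, so no crossing is required between them, while four changes of side follow, which is consistent with hydrophobic regions #3 to #6 each spanning the membrane once.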

The YahN Family 

All of the proteins of the LysE family were screened using the BLAST
program in an attempt to find additional homologues of the established
proteins of the LysE family. One protein appeared with a poor score.
This protein is YahN of E. coli. When YahN was BLASTed against the
databases, it proved to be a member of a family of substantial size. The
ten proteins identified are listed in Table 2. They all are in the same
size range as members of the LysE family. Most of the proteins of the
YahN family are from Gram-negative bacteria, five from E. coli. However,
one of the YahN family members is from the Gram-positive bacterium B.
subtilis (YrhP), and one is from the cyanobacterium Synechocystis
PCC6803 (Yg27). All ten of the YahN family members were predicted to be
related to the LysE family proteins based on PSI-BLAST results (Altschul
et al., 1997).

Figure 8. Multiple alignment of the CadD family. Protein abbreviations
are as described in Table 3, and the conventions of presentation are as
indicated in Figure 1.

A multiple alignment of the ten members of the YahN family is shown in
Figure 7. Only four residues are fully conserved, revealing a very
significant degree of sequence divergence among these proteins. The
average similarity analysis (not shown) revealed that hydrophobic
regions #2, #3, #4 and #6 (see Figure 2B) are all well conserved.
Hydrophobic region #4, which is less hydrophobic than hydrophobic
regions #2, #3 and #6, is the most strongly conserved portion of YahN
family proteins (compare Figure 7 with Figure 2B). The phylogenetic tree
for the ten members of the YahN family analyzed revealed little
clustering (Figure 3B). Based on the recognized phylogenetic
relationships of the organisms, we suggest that YeaS Eco and YcaR Pae
may be orthologues, but that YigJ Eco and Yd07 Hin are not.
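Counts of fully conserved residues, and the consensus lines shown beneath the alignments, follow from a simple column-by-column scan. The Python sketch below illustrates the idea under stated assumptions: the function names are invented here, "majority" is taken to mean more than half of the sequences (the text does not define the threshold), and the toy alignment is hypothetical rather than taken from Figure 1 or Figure 7.

```python
from collections import Counter


def fully_conserved(alignment):
    """1-based columns where every sequence carries the same residue."""
    return [
        index
        for index, column in enumerate(zip(*alignment), start=1)
        if "-" not in column and len(set(column)) == 1
    ]


def consensus(alignment):
    """Residue found in more than half of the sequences, else '-'."""
    total = len(alignment)
    line = []
    for column in zip(*alignment):
        residues = [aa for aa in column if aa != "-"]
        if residues:
            residue, count = Counter(residues).most_common(1)[0]
        else:
            residue, count = "-", 0  # all-gap column
        line.append(residue if count > total / 2 else "-")
    return "".join(line)


if __name__ == "__main__":
    # Hypothetical four-sequence alignment, for illustration only.
    toy = ["MPGAL", "MPGSL", "LPG-L", "MAGAL"]
    print(fully_conserved(toy))
    print(consensus(toy))
```

Applied to a real alignment, the length of the list returned by `fully_conserved` corresponds to the number of asterisks, and a low count relative to alignment length signals the kind of divergence described for the YahN family.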

The CadD Family 

Results obtained with the PSI-BLAST program suggested that the LysE and
YahN families are related to a third more distant family of proteins
that we have called the CadD family (Table 3). Two of the four sequenced
proteins that comprise this family are plasmid-encoded and probably
function in cadmium resistance (Chaouni et al., 1996). A third member of
this family has tentatively been suggested to catalyze efflux of
quaternary amines (see the database entry of this protein). The fourth
protein has not been examined for function. All four of these proteins
are derived from Gram-positive bacteria.

The multiple alignment of the four CadD family members is shown in
Figure 8. Many fully conserved residues are found within this set of
proteins, consistent with the conclusion that they do, in fact, comprise
a distinct family within the LysE superfamily. The average
hydrophobicity plot (Figure 2C) reveals a pattern strikingly similar to
those noted above for the proteins of the LysE (Figure 2A) and YahN
(Figure 2B) families. The phylogenetic tree for the CadD family (Figure
3C) revealed that the two orthologous cadmium resistance proteins
cluster tightly together while the QacF Bfi and Orf Sau proteins are
about equidistant from the CadB and CadD proteins. Proteins of the CadD
family exhibit no sequence or motif similarity to other recognized heavy
metal ion exporter families (Paulsen and Saier, 1997).

Discussion 

LysE Topological Model 

Sequence analyses of LysE superfamily proteins revealed six hydrophobic
regions (Figure 2A-C). Our experimental fusion protein analyses of the
C. glutamicum LysE protein showed that hydrophobic regions #3 to #6 each
traverse the membrane once, placing loops 3:4 and 5:6 towards the
cytoplasmic side, and loop 4:5 and the distal end of hydrophobic region
#6 towards the periplasmic side of the membrane (Figure 6). Further
experimental results showed that the expected arrangement of hydrophobic
regions #1 and #2 of LysE in an alternating transmembrane array is not
valid.

Instead, the experimental data obtained from 
LysE-PhoA fusions 29, 33, 37, 40, 59, 67 and 75 place 
these residues in LysE in the periplasm. Limitations of 
topological interpretation for reporter gene fusions near the 
N-terminal end of a protein have been reported. Thus, 
fusions can prevent translocation (Tate and Henderson, 
1993; Turk et al., 1996) or even reverse orientation of a 
membrane spanning helix due to missing positive charges 
(v. Heijne, 1994). Importantly, in both cases formation of 
active PhoA is prevented. However, in the case of the 
LysE-PhoA fusions, alkaline phosphatase is active, 
showing that translocation occurred. Therefore, 
hydrophobic region #1 is believed to be a transmembrane 
spanning helix with its carboxyl end directed towards the 
periplasm. Based on these results we propose that 
hydrophobic region #2 is not membrane spanning. Instead 
it either might be located peripherally on the periplasmic 
side of the membrane or loop into the membrane. Further 
evidence for an unusual topology for the N-terminal part of 
LysE is the rather short putative helix #1, which probably 
does not extend beyond the membrane, and the fact that 
loop 2:3 is not appreciably amphipathic (not shown). 
Although further investigations on the structure at the N- 
terminal part of LysE are certainly necessary, nevertheless 
the 2-D structure is conclusive for about three-quarters of 
the protein. Recently obtained 3-D structures of membrane 
proteins, such as those of aquaporin (Walz et al., 1997), 
and of channels (Nelson et al., 1999) show that regions 
are present which do not simply form transmembrane 
spanning helices or loops and which are mechanistically 
of great significance. 
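
The parity argument underlying this topological model can be illustrated with a minimal sketch. It assumes only that each transmembrane helix moves the polypeptide chain to the opposite side of the membrane; the function name and the side labels are hypothetical and are not part of the original analysis. With a cytoplasmic N-terminus, five spanning helices leave the C-terminus in the periplasm, whereas six would return it to the cytoplasm.

```python
def c_terminus_side(n_terminus_side: str, n_spans: int) -> str:
    """Return the compartment of the C-terminus after n_spans membrane crossings.

    Assumption (illustrative only): every transmembrane helix flips the
    chain from one side of the membrane to the other.
    """
    sides = ("cytoplasm", "periplasm")
    if n_terminus_side not in sides:
        raise ValueError("n_terminus_side must be 'cytoplasm' or 'periplasm'")
    if n_spans < 0:
        raise ValueError("n_spans must be non-negative")
    start = sides.index(n_terminus_side)
    # An even number of crossings returns to the starting side,
    # an odd number ends on the opposite side.
    return sides[(start + n_spans) % 2]


if __name__ == "__main__":
    for spans in (5, 6):
        print(spans, "spans ->", c_terminus_side("cytoplasm", spans))
```

Under this simple rule, a periplasmic C-terminus combined with a cytoplasmic N-terminus is compatible only with an odd number of membrane-spanning helices, which is why one of the six hydrophobic regions is proposed not to span the membrane.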

Possible Functionally Important Residues 

The identification of conserved residues in LysE together 
with a helical wheel analysis (not shown) allows us to 
speculate on residues which might be of functional 
significance. The best conserved region (LNPHVYL) is 
localized to the center of hydrophobic region #4 which is 
proposed to comprise a central part of the translocation 
channel. Neither the hydrophilic residues nor the fully 
conserved residues of hydrophobic region #4 are localized 
to one side of the helix. However, an important function for 
the conserved LNP motif can be postulated. The fully 
conserved asparagine within this motif is one helical turn 
away from a conserved threonine/serine residue, and these 
polar/semipolar residues could function together as part 
of the transmembrane active site. The location of the prolyl 
residue within this motif is in full accord with this view. Prolyl 
residues in membrane spanning helices of 
bacteriorhodopsin introduce kink angles of about 20° to 
position functionally important residues in the three- 
dimensional structure (Grigorieff et al., 1996). For example, 
Pro50 in helix B in bacteriorhodopsin positions an unpaired 
carbonyl oxygen of Thr46 which forms part of the channel 
(Deisenhofer and Michel, 1989). Importantly, the prolyl and 
threonyl residues in bacteriorhodopsin are separated by 4 
residues as is the case for these residues in LysE. These 
residues, together with the aspartate in hydrophobic region 
#2, could play a role in lysine binding and/or in binding of a 
putative cotransported or countertransported cation. 

The L-lysine exporter of C. glutamicum has been 
characterized as a secondary active transport system 
where lysine translocation is thought to be coupled to H+ 
influx or OH- efflux (Broer and Kramer, 1991). The relatively 
well conserved D/E and H residues in hydrophobic region 
#4 are localized near one another on one side of this 
putative helix. The two fully conserved aromatic residues 
of hydrophobic region #5 are localized to the same side of 
this putative helix, with the tryptophanyl residue preceded 
by and one helical turn away from the well conserved 
serine. Finally, hydrophobic region #6 exhibits one fully 
conserved M residue, and a well conserved D/N residue, 
two helical turns away. We therefore suggest that (a) the 
fully conserved D in hydrophobic region #2, (b) the N and 
T/S residues in hydrophobic region #4, (c) the fully 
conserved aromatic residues in hydrophobic region #5 and 
(d) the D/N and M residues in hydrophobic region #6 
represent reasonable candidates for residues that might 
line the transmembrane channel. 
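
The statements that residues lie "one helical turn away" or "on one side" of a helix follow from helical wheel geometry, which can be sketched as below. The sketch assumes an ideal alpha-helix with 3.6 residues per turn (100° of rotation per residue); the function names are hypothetical, and the code is an illustration of the principle rather than the analysis actually performed for LysE.

```python
def wheel_angle(index: int, degrees_per_residue: float = 100.0) -> float:
    """Angular position (0-360 degrees) of residue `index` on a helical wheel.

    Assumes an ideal alpha-helix: 3.6 residues per turn, i.e. 100 degrees
    of rotation per residue.
    """
    return (index * degrees_per_residue) % 360.0


def angular_separation(i: int, j: int) -> float:
    """Smallest angle (degrees) between residues i and j on the wheel."""
    diff = abs(wheel_angle(i) - wheel_angle(j)) % 360.0
    return min(diff, 360.0 - diff)


if __name__ == "__main__":
    # Residues 3 or 4 positions apart (about one turn) fall on nearly the
    # same face of the helix; residues 2 apart point in nearly opposite
    # directions.
    for spacing in (1, 2, 3, 4, 7):
        print(spacing, angular_separation(0, spacing))
```

With this geometry, residues separated by three or four positions are within 60° of each other on the wheel, which is the sense in which a pair such as the conserved asparagine and the threonine/serine residue could act together on one face of a helix.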

A Novel Superfamily of Solute Exporters? 

The intensive sequence analyses reported in this 
communication serve to define a novel superfamily (the 
LysE superfamily) of distantly related homologues which 
comprise three distinct families, the LysE, YahN and CadD 
families. The evidence that these proteins belong to a single 
superfamily can be summarized as follows: (i) all of the 
proteins of these three families have essentially the same 
sizes ranging from 195 to 236 amino acyl residues; (ii) all 
of these protein families exhibit essentially the same 
average hydrophobicity plots, a fact that suggests that the 
constituent proteins exhibit very similar three-dimensional 
structures; (iii) functionally characterized members of the 
LysE superfamily catalyze solute efflux; (iv) proteins of the 
LysE and YahN families exhibit convincing degrees of 
sequence similarity; (v) the PSI-BLAST program grouped 
all of the proteins of the LysE, YahN and CadD families 
together in a single superfamily (data not shown); (vi) none 
of the proteins predicted to be included in the LysE 
superfamily could be found in eukaryotes. 

A surprising fact emerges from the data summarized 
in Tables 1-3. Although only one of the LysE family 
homologues is from E. coli, and none of the CadD family 
members is from E. coli, five of the ten YahN proteins are 
from E. coli. Why this one bacterium would proliferate so 
many YahN paralogues poses an interesting question. The 
fact that the YahN members of E. coli have not been 
functionally investigated provides an intriguing argument 
in favour of a novel function for these proteins. 



Experimental Procedure 
Fusion Constructs 

Fragments to be inserted into pPA4, which contains phoA devoid of its 
promoter and leader peptide, were generated via PCR using ^PORTIysE 
as a template (Vrljic et al., 1996). As a sense primer, 5'- 
CTGTCCTGCAGCTTCCATAGGTCACGATGGTG-3' was used which 
anneals at the 5' end of lysE and generates a PstI restriction site. Twenty- 
two antisense primers were selected along the lysE sequence, generating 
suitable restriction sites for AriaCl or fipcFIV to enable the construction 
of in-frame fusions. PCR-amplified fragments were digested and cloned 
into pUC18 before cloning into pPA4. The following fusions were made 
(where numbering refers to the last amino acid specific to LysE): 29, 33, 
37, 40, 59, 67, 75, 99, 111, 112, 113, 114, 115, 120, 130, 143, 154, 164, 
171, 183, 211, and 236. The correctness of the fusions was confirmed by 
sequencing (except fusions 111-130). After construction of the plasmids in 
E. coli DH5αmcr, plasmids were introduced into E. coli CC118, which lacks 
endogenous alkaline phosphatase activity (Manoil and Beckwith, 1986). 
To make lacZ fusions the pGP vectors of Haardt and Bremer (1996) were 
used, as well as the appropriate primers to enable fusions at positions 40, 
67, 111, 171, and 211. The phenotype of the fusions 111 and 211 obtained 
was estimated in E. coli CC118 on LB plates containing 5-bromo-4-chloro- 
3-indolyl-β-D-galactoside (40 µg/ml). 

Enzyme Activity Determinations and Immunodetection 
of Fusion Proteins 

Single colonies of recombinant CC118 strains were grown overnight at 37°C 
in Luria-Bertani medium supplemented with 50 µg/ml carbenicillin. These 
cultures were used to inoculate new cultures which were grown for 3 hours 
to an OD of 1. Cells were permeabilized and used for phosphatase activity 
determinations based on an extinction coefficient of ε405 nm = 0.0188/M/cm 
for the chromogenic product 5-bromo-4-chloro-indolyl phosphate. For 
Western blot analyses, cultures were grown identically, but after harvesting, 
they were washed with 10 mM Tris-HCl, 10 mM MgSO4 (pH 8) and disrupted 
by sonification in the same buffer containing 40 µl/ml protease inhibitor 
Complete™ (Roche). Each sample (150 µg) was separated by SDS/PAGE 
and blotted onto nitrocellulose membranes. The PhoA protein was probed 
with monoclonal antibody DC133A (S'Prime Inc.; USA) and visualized with 
anti-IgG-coupled alkaline phosphatase using 5-bromo-4-chloro-indolyl 
phosphate/nitro blue tetrazolium as substrate. 

Computer Methods 

The BLAST and PSI-BLAST database search methods have been 
described (Altschul et al., 1997). The GAP program of Devereux et al. (1984) 
was used to establish homology and determine relative degrees of sequence 
similarity. Multiple alignments, as well as phylogenetic trees, were generated 
using the PREALIGN and TREE programs (Feng and Doolittle, 1990). 
Average hydropathy, similarity and amphipathicity plots based on the TREE- 
generated multiple alignments were produced as described (Kyte and 
Doolittle, 1982; Le et al., 1999) using a sliding window of 21 residues. 
Database sequence entries were corrected as indicated in the footnotes to Tables 
1 and 2. 
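
The sliding-window hydropathy calculation can be sketched as follows for a single sequence. This is a minimal illustration using the Kyte and Doolittle (1982) hydropathy scale and a 21-residue window; it does not reproduce the averaging over aligned family members performed by the programs cited above, and the function name and the toy sequence are hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence: str, window: int = 21) -> list[float]:
    """Mean hydropathy of every full window along a single sequence.

    Element k of the result is the average over residues k .. k+window-1.
    """
    sequence = sequence.upper()
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[aa] for aa in sequence]
    running = sum(scores[:window])
    profile = [running / window]
    for i in range(window, len(scores)):
        # Slide by one residue: add the new score, drop the oldest one.
        running += scores[i] - scores[i - window]
        profile.append(running / window)
    return profile


if __name__ == "__main__":
    # Toy sequence (not a real protein): a hydrophobic stretch of 21
    # leucines flanked by charged residues.
    toy = "K" * 5 + "L" * 21 + "D" * 5
    profile = hydropathy_profile(toy)
    peak = max(range(len(profile)), key=profile.__getitem__)
    print("windows:", len(profile), "peak window starts at residue", peak)
```

A window of about 21 residues is roughly the length of a membrane-spanning alpha-helix, so peaks in such a profile are commonly read as candidate transmembrane segments.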



Note Added in Proof 

After completion of this manuscript, the functions of two 
members of the YahN family were reported (Aleshin, V.V., 
Zakataeva, N.P., and Livshits, V.A. 1999. A new family of 
amino acid efflux proteins. Trends Biochem. Sci. 24: 133- 
135; Zakataeva, N.P., Aleshin, V.V., Tokmakova, I.L., 
Troshin, P.V., and Livshits, V.A. 1999. The novel 
transmembrane Escherichia coli proteins involved in amino 
acid efflux. FEBS Lett. 452: 228-232). The two proteins 
are YigK, renamed RhtB, a probable homoserine/ 
homoserine lactone/β-hydroxynorvaline exporter, and YigJ, 
renamed RhtC, a probable threonine exporter. We have 
therefore renamed the YahN family, the RhtB family. The 
three constituent families of the LysE superfamily are 
therefore (1) the LysE family (TC #2.75), (2) the RhtB family 
(TC #2.76) and (3) the CadD family (TC #2.77) (see our 
web site http://www-biology.ucsd.edu/~msaier/transport/). 